Behavioral Data Mining

Authors

  • Daniel Aranki
  • Alice Wang
  • Avital Steinitz
Abstract

In this paper, we describe the design considerations, the implementation details, and the results of performing distributed K-Means clustering on a Wikipedia snapshot of around 13 million documents. The design and implementation are based on the MapReduce programming paradigm, using the MapReduce implementation provided by Apache Hadoop [1]. We ran our algorithms on the UC Berkeley ICluster [2].
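The abstract does not give the authors' code, so the following is only a minimal sketch of how one K-Means iteration is commonly phrased as a map step and a reduce step. It runs in a single Python process rather than on Hadoop, and it assumes documents have already been turned into dense numeric vectors. The function names (`nearest_centroid`, `map_phase`, `reduce_phase`, `kmeans`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    best_idx, best_dist = 0, math.inf
    for idx, centroid in enumerate(centroids):
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def map_phase(points, centroids):
    """Map: emit (centroid index, (point, 1)) for every input point."""
    for point in points:
        yield nearest_centroid(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the vectors and counts per centroid, then average them."""
    sums = {}
    counts = defaultdict(int)
    for idx, (point, n) in pairs:
        if idx not in sums:
            sums[idx] = list(point)
        else:
            sums[idx] = [s + p for s, p in zip(sums[idx], point)]
        counts[idx] += n
    # Assumption: a centroid that received no points keeps its old position.
    return [
        [s / counts[i] for s in sums[i]] if i in sums else list(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds and return the final centroids."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids
```

In a real MapReduce job the mapper output would be grouped by key by the framework, and a combiner could pre-aggregate the partial sums on each node to cut network traffic. Here the grouping happens inside `reduce_phase` to keep the sketch short.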

Similar resources

Dynamic segmentation and ranking approach of customers and identifying their behavioral mobility using data mining techniques in Kargaran Welfare Bank

Nowadays, identifying customers, determining their value, and segmenting them are essential for a bank. This study addresses the dynamic classification of Kargaran Welfare Bank customers and the identification of their behavioral mobility between different departments over a specific period using data mining techniques. To this end, the transaction data of the bank's customers was taken as the statistical population. I...

Sessionization - A Vital Stage in Data Preprocessing of Web Usage Mining - A Survey

The World Wide Web has affected almost every aspect of our lives in the modern era. The Web has many unique characteristics, which make mining useful information and knowledge a challenging task. Web mining uses many data mining techniques, but it is not simply an application of traditional data mining, owing to the heterogeneity and unstructured nature of the data on the Web. Web mining tasks can be categorize...

Web Usage Mining: Pattern Discovery and Forecasting

Web usage mining is the automatic discovery of patterns in clickstreams and associated data collected or generated as a result of user interactions with one or more Web sites. This paper describes web usage mining on our college log files to analyze the behavioral patterns and profiles of users interacting with a Web site. The discovered patterns are represented as clusters that are frequently acces...

Data Mining in Educational System using WEKA

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential, used in various commercial applications including retail sales, e-commerce, remote sensing, and bioinformatics. Education is an essential element for the progress of a country. Mining in an educational environment is called Educational Data Mining. Educational data min...

An Artificial Life Approach to Data Mining

In this paper we describe a novel approach to Data Mining: artificial life forms, called DataBots, simulated in a computer show collective behavioral patterns that correspond to structural features in a high dimensional input space. Movement strategies for DataBots have been found and tested on a real world data set. Important structural properties could be found and visualized by the collectiv...

A Framework for Discovery and Diagnosis of Behavioral Transitions in Event-streams

Data stream mining techniques can be used in tracking user behaviors as they attempt to achieve their goals. Quality metrics over stream-mined models identify potential changes in user goal attainment. When the quality of some data-mined models varies significantly from nearby models, as defined by quality metrics, then the user’s behavior is automatically flagged as a potentially significant beh...

Publication date: 2012